Abstract

In this work, we study the problem of the optimal dissemination of Channel State Information
(CSI) among K spatially distributed Transmitters (TXs) jointly cooperating to serve K Receivers (RXs). 
One of the particularities of this work comes from the fact that the CSI is distributed in the sense that 
each TX obtains its own estimate of the global multi-user MIMO channel with no further exchange of 
CSI being allowed between the TXs. This model is particularly well suited to the cooperation between
non-colocated TXs such as base stations in the downlink of network MIMO systems or other interference
networks allowing joint processing. Our work is rooted in the pervasive intuition that the quality with 
which a channel element, relative to a given RX, is represented at a TX should be a function of the 
link strength between the RX and the TX. However, the analysis of the amount of CSI required in 
this setting and its spatial allocation throughout the network has not been addressed previously. We 
study this problem asymptotically in the Signal-to-Noise Ratio (SNR) by using the notion of generalized 
Degrees of Freedom (DoF). We propose a CSIT allocation, denoted as distance-based, which achieves 
the maximal generalized DoF. Remarkably, the number of feedback bits per TX for this particular 
allocation policy does not grow unbounded with the size of the network. This is in sharp contrast with 
the conventional (uniform) CSI dissemination policy. Consistent with the intuition, this DoF achieving 
CSI dissemination policy is shown to limit the cooperation between the TXs to a finite neighborhood 
around each TX. Hence, it can be seen as a more efficient alternative to classical network MIMO 
clustering. 



This work has been performed under the Celtic-Plus project SHARING. Preliminary results have been published in [1].

I. Introduction

Network (or Multicell) MIMO methods, whereby multiple interfering transmitters (TXs) share
user messages and allow for joint precoding, are currently considered for next generation wireless
networks [2]. With perfect message and channel state information (CSI) sharing, the different TXs
can be seen as a unique virtual multiple-antenna array serving all receivers (RXs), in a
multiple-antenna broadcast channel (BC) fashion. However, sharing the users' data symbols and
the CSI with all cooperating TXs imposes huge requirements on the architecture, particularly as
the number of cooperating TXs increases.

Consequently, the cooperation is usually limited to small cooperation clusters inside which
the TXs cooperate. The optimal way of forming these clusters has recently become an active
research topic [3]-[6]. Still, clustering leads to some fundamental limitations. Firstly, there is
inevitably inter-cluster interference on the boundaries of the cluster. Secondly, every TX inside
the cluster must obtain the CSI relative to the entire cluster, which means that the amount of
CSI required increases very quickly with the number of TXs inside the cluster. Several works have
focused on determining the optimal size of the clusters when taking into account the cost of
estimating the channel elements [7]-[9]. They suggest that TX cooperation cannot efficiently
manage interference, even if the backhaul links are strong enough to form large clusters. The main
message behind [9] is that pilot-based channel estimation incurs a substantial loss when trying to
learn the channel from a large number of users within a finite coherence time interval, causing
the DoF to saturate. In this work, we focus instead on the downlink and we assume a system where
the terminals have access to perfect knowledge of their channels and must convey this information
via limited rate uplink feedback.

We furthermore consider that clustering is not used and that each TX is able to cooperate with
any TX regardless of the distance. Each TX receives its own estimate of the multi-user
channel and implements its part of the joint precoder in a distributed manner, based solely on
its own CSI. Hence, we focus here on the problem of CSI sharing between the TXs and not on the
added effects of pilot-based channel estimation, which will be the focus of subsequent work. Note
that the optimization of the feedback allocation is the topic of many results, e.g., [10]-[16].
However, these papers consider the case where the users' data symbols are not shared between the
TXs, which yields a different problem statement.
The question which follows is to determine the right amount of CSI which should be
made available to each TX so as to optimize an overall network utility subject to the practical
constraints of the multi-user feedback channel. To tackle this question, we consider a Degrees
of Freedom (DoF) analysis. Although it offers only a partial understanding of the problem (namely
the large SNR behavior), it provides key insights in our scenario. In our previous work [17], the
notion of DoF was already used in a distributed CSI setting, but in the limiting case where all
the wireless links between a TX and a RX have the same average power, and the focus was on
the derivation of robust precoders, not on the CSI allocation.

In order to gain analytical tractability, we focus here on the particular case of so-called linear 
networks (or 1-D networks). Our contributions are as follows: 

• We propose a novel CSIT allocation policy, referred to as "Distance-based". 

• We establish that the new policy, combined with joint precoding of all user messages, leads 
to the maximum generalized DoF. 

• We characterize the feedback requirements scaling (in terms of number of users K) and 
show that the per-TX scaling is bounded. This is in sharp contrast with conventional CSIT 
allocation for which the feedback requirement grows unbounded with K. 

• We show that the number of data symbols necessary at one TX for achieving the maximal 
DoF also saturates as K increases. Hence, our CSI and data allocation policy appears as 
an attractive alternative to clustering. 

Notations: We denote by $e_i$ the $i$-th column of the $K \times K$ identity matrix, and by $\delta_i(j)$ the
Kronecker symbol which is equal to 1 if $j = i$ and to zero otherwise. The operator $[\cdot]^+$ takes
the maximum between the real argument and 0, and $\lceil \cdot \rceil$ denotes the ceiling operator. $|\mathcal{A}|$ is
used to denote the cardinality of the finite set $\mathcal{A}$. The complex circularly symmetric Gaussian
distribution with zero mean and variance $\sigma^2$ is represented by $\mathcal{N}_{\mathbb{C}}(0, \sigma^2)$. The $(i,j)$-th element
of a matrix $A$ is denoted equivalently as $\{A\}_{ij}$ or as $A_{ij}$.

II. System Model 

A. Network MIMO 

We consider a network MIMO setting in which K non-colocated transmitters (TXs) transmit 
jointly via linear precoding to K receivers (RXs) equipped with a single antenna and applying 
single user decoding. Each TX has the knowledge of the K data symbols to transmit to the K RXs 
(owing to a TX-cooperation-friendly routing protocol for the user-plane data). The transmission is then
described as

$$ y = Hx + \eta, \tag{1} $$

where $y = [y_1, \dots, y_K]^T \in \mathbb{C}^{K \times 1}$ contains the received signals at the K RXs, $h_i^H \in \mathbb{C}^{1 \times K}$ is
the channel to the $i$-th RX, $\eta = [\eta_1, \dots, \eta_K]^T \in \mathbb{C}^{K \times 1}$ is the i.i.d. $\mathcal{N}_{\mathbb{C}}(0, 1)$ normalized noise at
the RXs, and $x = [x_1, \dots, x_K]^T \in \mathbb{C}^{K \times 1}$ represents the transmit signals at the K TXs. We also
define the multi-user channel $H = [h_1, \dots, h_K]^H$. $H_{ik}$ designates the fading coefficient between
TX $k$ and RX $i$. We consider Rayleigh fast fading such that $H_{ik} \sim \mathcal{N}_{\mathbb{C}}(0, \sigma_{ik}^2)$, where the value
of $\sigma_{ik}^2$ will change depending on the geometry of the network.

The transmit signal $x$ is obtained from the users' data symbols $s = [s_1, \dots, s_K]^T \in \mathbb{C}^{K \times 1}$
(i.i.d. $\mathcal{N}_{\mathbb{C}}(0, 1)$) as

$$ x = Ts = [t_1, \dots, t_K]\,[s_1, \dots, s_K]^T. \tag{2} $$

Hence, the vector $t_i \in \mathbb{C}^{K \times 1}$ represents the beamforming vector used to transmit $s_i$ to RX $i$
and we define $T = [t_1, \dots, t_K] \in \mathbb{C}^{K \times K}$ as the multi-user joint precoder. We consider a sum
power constraint and an equal power allocation to the users, both for clarity and because it does
not impact the DoF. The rate of user $i$, averaged over the fading distribution as well as over the
quantization error, is then written as
(3)

where $\mathrm{E}_{W}[\cdot]$ denotes the average over the quantization at all the TXs. The Degrees of Freedom
(DoF) at RX $i$ is then defined as
(4)

We then use the notion of generalized DoF [18]-[24] to represent the attenuation of the
interference in relation to the power regime. Hence, the generalized DoF at RX $i$ is defined by
(5)
where the CSIT allocation $B$ and the precoding used will be described in the following and the
matrix $\Gamma \in \mathbb{R}^{K \times K}$ is called the interference level matrix and can be written as

Without loss of generality, we set in the following $\sigma_{ii}^2 = 1$. To model the interference attenuation,
we also have in this work that $\forall i \neq j$, $\{\Gamma\}_{ij} < 1$.

B. Distributed CSI 

The joint precoder is implemented distributively at the TXs with each TX relying solely on
its own estimate of the channel matrix in order to compute its transmit coefficient, without any
exchange of information with the other TXs [17], [25]. To model the imperfect CSI at the TX
(CSIT), the channel estimate at each TX is assumed to be obtained from a limited rate digital
feedback scheme. Consequently, we introduce the following definitions.

Definition 1 (Distributed Finite-Rate CSIT). We represent a CSIT allocation by the matrix $B \in \mathbb{R}^{K \times K}$.
The channel estimate of $h_i$ at TX $j$ is then denoted by $h_i^{(j)}$ and is selected from

$$ h_i^{(j)} = \arg\min_{w \in \mathcal{W}_i^{(j)}} \| h_i - w \|^2, \tag{7} $$

where $\mathcal{W}_i^{(j)}$ is a codebook containing $2^{B_i^{(j)}}$ elements, $B_i^{(j)}$ being the number of bits that $B$
allocates to the estimate of $h_i$ at TX $j$.

Hence, TX $j$ collects the multi-user channel estimate $H^{(j)} = [h_1^{(j)}, \dots, h_K^{(j)}]^H$ from the
feedback. Results on the quantization error resulting from this quantization scheme are provided in
Appendix I.

It is a well known result that the number of CSI feedback bits should scale with the SNR 
in order to achieve a positive DoF [[TJ, p7| , p6| . Consequently, we define the size of a CSIT 



allocation as follows. 

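Definition 2 (Size of a CSIT allocation). The size $s(\cdot)$ of a CSIT allocation $B$ at TX $j$ is defined
as

$$ s(e_j^T B) \triangleq \lim_{P \to \infty} \frac{\sum_{i=1}^{K} B_i^{(j)}}{\log_2(P)} \tag{8} $$

such that the total size of a CSIT allocation is

$$ s(B) \triangleq \lim_{P \to \infty} \frac{\sum_{j=1}^{K} \sum_{i=1}^{K} B_i^{(j)}}{\log_2(P)}, \tag{9} $$

where $B_i^{(j)}$ denotes the number of bits that $B$ allocates to the estimate of $h_i$ at TX $j$.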
Remark 1. We consider here a digital quantization of the channel vectors but the results can be
easily translated to a setting where analog feedback is used [7], [27]. Furthermore, only the CSI
requirements at the TXs are investigated, and different scenarios can be envisaged for the sharing
of the channel estimates (e.g., direct broadcasting from the RXs to all the TXs, sharing through
a backhaul, ...) [28]. □



C. Distributed Precoding 

Based on its individual CSI, each TX designs its transmit coefficients. In this work, we focus
on the CSI dissemination problem under a conventional precoding framework. Note that the
design of a precoding scheme that is optimally robust in the context of distributed CSI is a
research topic in its own right, and a challenging one [17]. We assume that conventional Zero
Forcing (ZF) is used since ZF is well known to achieve the maximal DoF in the MIMO BC with
perfect CSIT [7], [26]. Furthermore, considering limited feedback in the compound MIMO BC,
it is shown in [29] that no other precoding scheme can achieve the maximal DoF with a lower
feedback scaling. This confirms the efficiency of ZF in terms of DoF, even when faced with
imperfect CSI.

Based on its own channel estimate $H^{(j)}$, TX $j$ computes the beamforming vector $t_i^{(j)}$ to
transmit symbol $s_i$ such that

$$ \forall i \in \{1, \dots, K\}, \quad t_i^{(j)} \triangleq \sqrt{P}\, \frac{(H^{(j)})^{-1} e_i}{\| (H^{(j)})^{-1} e_i \|}. \tag{10} $$

Although a given TX $j$ may compute the whole precoding matrix $T^{(j)}$, only the $j$-th row is
of practical interest. Indeed, TX $j$ transmits solely $x_j = e_j^T T^{(j)} s$. The effective multi-user
precoder $T$ then verifies

$$ \forall j \in \{1, \dots, K\}, \quad e_j^T T = e_j^T T^{(j)}. \tag{11} $$

We denote by the superscript PCSI the coefficients obtained when all the TXs have perfect CSI.
Remark 2. Each TX normalizes the beamformer independently, based on channel estimates that are
a priori different. Hence, the power constraint is only approximately fulfilled.
Yet, the power constraint can easily be seen to be fulfilled on average over the estimation errors.
Furthermore, the power constraint is also asymptotically exactly fulfilled as the accuracy of the
CSI increases. As a consequence, we do not consider here the problem of imperfect fulfillment
of the power constraint. □
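The distributed precoding step of (10) can be illustrated with a short numerical sketch. It is only an illustration under stated assumptions: the estimation error at each TX is modeled as additive Gaussian noise of variance `err_var` (a stand-in for the quantization of Definition 1, not the actual codebook search), the per-stream power is taken as P, and the function names `zf_precoder` and `distributed_precoder` are hypothetical.

```python
import numpy as np

def zf_precoder(H, P):
    """Column-normalized ZF precoder: t_i = sqrt(P) * H^{-1} e_i / ||H^{-1} e_i||."""
    H_inv = np.linalg.inv(H)
    return np.sqrt(P) * H_inv / np.linalg.norm(H_inv, axis=0, keepdims=True)

def distributed_precoder(H, P, err_var, rng):
    """Effective precoder when TX j keeps only row j of the ZF precoder
    computed from its own estimate H^(j) = H + noise (assumed error model)."""
    K = H.shape[0]
    T = np.zeros((K, K), dtype=complex)
    for j in range(K):
        noise = np.sqrt(err_var / 2) * (rng.standard_normal((K, K))
                                        + 1j * rng.standard_normal((K, K)))
        T[j, :] = zf_precoder(H + noise, P)[j, :]
    return T

def max_leakage(H, T):
    """Largest off-diagonal magnitude of H @ T, i.e. residual interference."""
    G = H @ T
    return np.abs(G - np.diag(np.diag(G))).max()

rng = np.random.default_rng(0)
K, P = 4, 100.0
H = (rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))) / np.sqrt(2)

T_perfect = distributed_precoder(H, P, err_var=0.0, rng=rng)
T_noisy = distributed_precoder(H, P, err_var=1.0 / P, rng=rng)

leak_perfect = max_leakage(H, T_perfect)  # numerically zero: perfect CSI at all TXs
leak_noisy = max_leakage(H, T_noisy)      # non-zero: the rows come from different estimates
```

With identical estimates at all TXs the rows assemble into a single ZF precoder and the interference vanishes. With independent errors each row comes from a slightly different inverse, which is the source of the residual interference analyzed in the sequel.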

D. Optimization of the CSIT allocation 

Directly optimizing the allocation of the number of bits at finite SNR is a challenging
problem which gives little hope for analytical results. Instead, we focus on the CSIT allocations
achieving the maximal generalized DoF.

Definition 3. We define the set of DoF-optimal CSIT allocations $\mathcal{B}_{DoF}(\Gamma)$ as

$$ \mathcal{B}_{DoF}(\Gamma) \triangleq \{ B \mid \forall i, \ \mathrm{DoF}_i(B, \Gamma) = 1 \}. \tag{12} $$

Hence, an interesting problem consists in finding the minimal CSIT allocation (where
minimality refers to the size in Definition 2) which achieves the maximal generalized DoF at every
user:

$$ \text{minimize}_{B} \ s(B), \quad \text{subject to } B \in \mathcal{B}_{DoF}(\Gamma). \tag{13} $$

In this paper, we focus on an achievability result, by exhibiting a CSIT allocation that
achieves the maximal DoF while having a much lower size than the conventional (uniform)
CSIT allocation. The problem of finding a minimal-size allocation policy while guaranteeing
full DoF (i.e., DoF equal to the perfect CSIT case) is an interesting problem, but an extremely
challenging one, which, to the best of our knowledge, remains open.

III. Preliminary Results 

As a preliminary step, we derive a sufficient criterion on the precoder for achieving the 
maximal DoF. 

Proposition 1. The maximal DoF is achieved by using the precoder $T$ if the CSIT allocation
matrix $B$ is such that

$$ \mathrm{E}_{H,W}\left[ \| U - U^{PCSI} \|_F^2 \right] = O\!\left( \frac{1}{P} \right), \tag{14} $$

where we have defined the multi-user unitary precoder $U = [u_1, \dots, u_K]$ with $u_i$ =

Proof: A detailed proof is provided in Appendix II. ■

A. The Conventional CSIT allocation is DoF Achieving 

The conventional dissemination of the CSI corresponds to providing each TX with the full
multi-user CSI, enabling all the TXs to do the same processing and compute a common
unitary beamformer $U^{(j)} = U$. Hence, the condition of Proposition 1 can be rewritten as

$$ \mathrm{E}_{H,W}\left[ \| U^{(j)} - U^{PCSI} \|_F^2 \right] = O\!\left( \frac{1}{P} \right). \tag{15} $$

Accordingly, the following achievability result is obtained.

Proposition 2. Let us define the Conventional CSIT allocation $B^{conv} \in \mathbb{R}^{K \times K}$ such that

$$ \forall i, j, \quad \{B^{conv}\}_{i,j} = \sum_{k=1}^{K} \log_2(P \sigma_{ik}^2). \tag{16} $$

Then, $B^{conv} \in \mathcal{B}_{DoF}$.

Proof: A detailed proof is provided in Appendix III. ■

This CSIT allocation provides to each TX the K channel estimates relative to the K RXs.
This means that each TX requires a number of channel estimates growing unbounded with K,
which represents a serious issue in large/dense networks.

B. CSIT allocation with Distributed Precoding 

The conventional CSIT allocation being costly in terms of dissemination of the CSI, the focus
of this work is on the derivation of a more efficient solution to the optimization problem (13).
A crucial observation is that each TX does not need to compute accurately the full precoder.
Indeed, the sufficient criterion (14) can be written in the distributed CSI setting as

$$ \forall j \in \{1, \dots, K\}, \quad \mathrm{E}_{H,W}\left[ \| e_j^T (U^{(j)} - U^{PCSI}) \|^2 \right] = O\!\left( \frac{1}{P} \right). \tag{17} $$

This leads to the fundamental question, which will be tackled in the following, of determining
which channel coefficients should be known with which accuracy at TX $j$ so as to fulfill the
conditions in (17).

IV. Distance-Based CSIT allocation in the Wyner Model



We start our analysis in the so-called linear Wyner model where K single-antenna TXs
serve K single-antenna RXs with a RX receiving interference coming only from its two direct
neighboring TXs [30]. Furthermore, the interference is attenuated by an inter-cell attenuation
equal to $\mu \in (0, 1)$. Thanks to its simplicity, this model has already been successfully applied
to quantify cooperation gains analytically in many recent works [31]-[35]. The multi-user
channel matrix $H \in \mathbb{C}^{K \times K}$ is then a tridiagonal matrix equal to
(18)

where $h_i$, $d_i$, and $a_i$ are i.i.d. $\mathcal{N}_{\mathbb{C}}(0, 1)$. The transmission is modeled in Fig. 1.

Following the generalized DoF model [18], we furthermore assume that $\mu^2 = P^{\gamma - 1}$ where we
define $\gamma$ as the interference level. This implies using the notations introduced earlier that

$$ \sigma_{ij}^2 = \begin{cases} 1 & \text{if } i = j \\ \mu^2 = P^{\gamma - 1} & \text{if } i = j \pm 1 \\ 0 & \text{else} \end{cases} \tag{19} $$

such that the interference level matrix $\Gamma$ is equal to

$$ \{\Gamma\}_{ij} = \begin{cases} 1 & \text{if } i = j \\ \gamma & \text{if } i = j \pm 1 \\ 0 & \text{else.} \end{cases} \tag{20} $$

A. CSIT allocation in the Wyner Model 



In this Wyner model, the conventional CSIT allocation in (16) reads as

$$ \forall i, j, \quad \{B^{conv}\}_{i,j} = \lceil \log_2(P) \rceil + 2 \lceil \gamma \log_2(P) \rceil. \tag{21} $$

Fig. 1. Schematic representation of the transmission in the Wyner Model.



We can then state one of the main results.

Theorem 1. Let us define the CSIT allocation $B^{dist}$ such that

$$ \forall i, j, \quad \{B^{dist}\}_{i,j} = \left\lceil [1 + (\gamma - 1)|i - j|]^+ \log_2(P) \right\rceil + 2 \left\lceil [\gamma + (\gamma - 1)|i - j|]^+ \log_2(P) \right\rceil. \tag{22} $$

Then $B^{dist} \in \mathcal{B}_{DoF}$.

Proof: A detailed proof is provided in Appendix IV. ■

It can be seen that $B^{dist}$ becomes equal to $B^{conv}$ as $\gamma$ tends to 1 while the size of $B^{dist}$ becomes
smaller as $\gamma$ decreases. Intuitively, this result is based on the off-diagonal exponential decrease
of the channel inverse (precoding) coefficients [1], [36]. Thus, it offers a rigorous treatment of the
intuitive design idea by which a given TX contributes less power to the transmission of a stream
for a user located far away and hence requires only a coarse knowledge of the environment
surrounding that user. In fact, if the user is located far enough, the TX contribution is negligible
and can be set to zero without any DoF loss. This represents the basis for the following properties.
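As a small numerical illustration of how the two Wyner-model allocations compare, the sketch below evaluates the conventional allocation (21) and the distance-based allocation (22) entry by entry. The function names and the chosen values of P and γ are assumptions made for the example only.

```python
import math

def b_conv_wyner(P, gamma):
    """Bits per entry in the conventional Wyner allocation (21)."""
    log_p = math.log2(P)
    return math.ceil(log_p) + 2 * math.ceil(gamma * log_p)

def b_dist_wyner(P, gamma, i, j):
    """Bits for entry (i, j) in the distance-based Wyner allocation (22)."""
    log_p = math.log2(P)
    d = abs(i - j)
    direct = max(1 + (gamma - 1) * d, 0.0)        # [1 + (gamma - 1)|i - j|]^+
    neighbor = max(gamma + (gamma - 1) * d, 0.0)  # [gamma + (gamma - 1)|i - j|]^+
    return math.ceil(direct * log_p) + 2 * math.ceil(neighbor * log_p)

P, gamma = 2.0 ** 20, 0.5
conv_bits = b_conv_wyner(P, gamma)                               # 40
row = [b_dist_wyner(P, gamma, i=0, j=j) for j in range(5)]       # [40, 10, 0, 0, 0]
```

For these example values the distance-based allocation matches the conventional one at distance zero and drops to zero bits from distance 2 onward, which is the behavior exploited by the corollaries that follow.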

B. Properties of the Distance-Based CSIT allocation 

An important aspect of any feedback allocation policy is how the feedback load
scales as the number of cooperating users K increases.

Corollary 1. The size of the distance-based CSIT allocation at any TX remains bounded as K
increases:

$$ \forall j \in \{1, \dots, K\}, \quad s(e_j^T B^{dist}) = O(1), \ \text{as } K \text{ increases}. \tag{23} $$

Proof: This result follows trivially from (22) by observing that $\{B^{dist}\}_{i,j}$ is equal to 0 if
$|i - j| > 1/(1 - \gamma)$. ■

This result is in stark contrast with the conventional CSIT allocation which yields $s(e_j^T B^{conv}) = O(K)$.
An interesting question is whether this property of local CSI exchange extends to the
sharing of the users' data symbols.

Corollary 2. Let us denote by $\mathcal{K}_j$ the set of users' data symbols that TX $j$ has knowledge of.
In the distance-based CSIT allocation, it is sufficient for achieving the maximal DoF that $s_i \in \mathcal{K}_j$
if

$$ 1 + (\gamma - 1)|i - j| > 0. \tag{24} $$

In the Wyner model, it follows that for $\gamma < 1$,

$$ \forall j, \quad |\mathcal{K}_j| = O(1), \ \text{as } K \text{ grows}. \tag{25} $$

Proof: It can be seen from the proof in Appendix IV that setting $\{t_i\}_j$ to zero for any $i, j$
verifying $1 + (\gamma - 1)|i - j| \leq 0$ creates additional interference with power less than $o(1/P)$,
which is hence negligible in terms of DoF. ■

The operational meaning of the above result is as follows. Define

$$ K_0 \triangleq \frac{1}{1 - \gamma}, \tag{26} $$

then no exchange of information (CSI or data symbol) is necessary between two TXs separated
by at least $K_0$ TXs in the linear network. Intuitively, $K_0$ is the size of the neighborhood inside
which the cooperation should occur. Altogether, the distance-based CSIT allocation and the users'
data sharing solution provide an attractive alternative to clustering. Indeed, as with clustering, this
cooperation scheme restricts the sharing of the CSI and data symbols to a finite neighborhood
around each TX. The difference is that the hard boundaries of the cluster are replaced by a
smooth decrease of the amount of cooperation.
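The saturation of the data-sharing set can be checked with a few lines of code. The sketch below enumerates the users whose symbols are needed at a TX according to condition (24) and evaluates the neighborhood size of (26); the helper name `cooperation_set` and the example value of γ are assumptions made for illustration.

```python
def cooperation_set(j, K, gamma):
    """Users i whose symbol s_i is needed at TX j: 1 + (gamma - 1)|i - j| > 0."""
    return [i for i in range(1, K + 1) if 1 + (gamma - 1) * abs(i - j) > 0]

gamma = 0.5
k0 = 1 / (1 - gamma)  # neighborhood size, here 2.0

# Size of the set at a TX in the middle of the network, for growing K.
sizes = {K: len(cooperation_set(j=K // 2, K=K, gamma=gamma)) for K in (10, 100, 1000)}
# sizes == {10: 3, 100: 3, 1000: 3}
```

For this value of γ the set contains only the TX's own user and its two direct neighbors, independently of K.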



V. Distance-Based CSIT allocation in Exponentially Decaying Channels 

The Wyner model is interesting for the insight that can be obtained analytically.
It is nevertheless a very simplified modeling of the transmission with the main limiting
assumption being that only the direct neighbors interfere. We consider in the following a more
general channel model lifting this restriction. Hence, all the links will be assumed to be non-zero
and the channel matrix to have an exponentially decaying structure. This means that the amplitude
of the $(i, j)$-th element of the channel matrix decreases exponentially with the difference $|i - j|$.

In the following, we consider that the channel matrix $H$ is made from the element-wise product
of a standard Rayleigh fading channel matrix $G \in \mathbb{C}^{K \times K}$ and a Toeplitz matrix generated from
the vector $[1, \mu^{1/2}, \dots, \mu^{(K-1)/2}]$:
(27)

Similarly to the Wyner model, the parameter $\mu$ is defined such that $\mu = P^{\gamma - 1}$ with $\gamma \in (0, 1)$
being the interference level. Using the notations introduced earlier, we then have
$\sigma_{ij}^2 = \mu^{|i - j|} = P^{(\gamma - 1)|i - j|}$. It follows that the interference level matrix $\Gamma$ is equal to

$$ \forall i, j, \quad \{\Gamma\}_{ij} = 1 + (\gamma - 1)|i - j|. \tag{28} $$
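A minimal sketch of how such a channel could be drawn numerically is given below. It assumes only what is stated above, namely that the variance of the $(i, j)$-th element is $P^{(\gamma - 1)|i - j|}$; the function name `exp_decay_channel` and the parameter values are hypothetical.

```python
import numpy as np

def exp_decay_channel(K, P, gamma, rng):
    """Draw H = A * G (element-wise) with Var(H_ij) = P^((gamma - 1)|i - j|).

    Returns the channel and the interference levels 1 + (gamma - 1)|i - j| of (28).
    """
    idx = np.arange(K)
    dist = np.abs(idx[:, None] - idx[None, :])   # |i - j| for every pair
    amplitude = P ** ((gamma - 1) * dist / 2)    # Toeplitz amplitude profile
    G = (rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))) / np.sqrt(2)
    return amplitude * G, 1 + (gamma - 1) * dist

rng = np.random.default_rng(1)
H, Gamma = exp_decay_channel(K=6, P=1e3, gamma=0.6, rng=rng)
```

The amplitude profile is the square root of the variance profile, so the element-wise product with a unit-variance Rayleigh matrix yields the desired variances.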
A. CSIT allocation in Exponentially Decaying Channels

In that model, the conventional CSIT allocation in (16) is equal to

$$ \forall i, j, \quad \{B^{conv}\}_{i,j} = \sum_{k=1}^{K} \left\lceil [1 + (\gamma - 1)|i - k|]^+ \log_2(P) \right\rceil. \tag{29} $$

We will now show how the results of the Wyner model extend to this setting.

Theorem 2. Let us define the CSIT allocation $B^{dist}$ such that

$$ \forall i, j, \quad \{B^{dist}\}_{i,j} = \sum_{k=1}^{K} \left\lceil [1 + (\gamma - 1)|i - k| + (\gamma - 1)|i - j|]^+ \log_2(P) \right\rceil. \tag{30} $$

Then $B^{dist} \in \mathcal{B}_{DoF}$.

Proof: A detailed proof is provided in Appendix V. ■

Letting $\gamma$ tend to one inside (30), the distance-based CSIT allocation converges to the
conventional CSIT allocation (29). Similar to the Wyner case, the intuition is that the
distance-based CSIT allocation exploits the exponential decrease of the channel inverse. This can be
related to the space of infinite exponentially decaying matrices being closed under
inversion [37]-[40].
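The two allocations of this section can be compared numerically. The sketch below assumes that (29) sums the per-element bits $\lceil [1 + (\gamma - 1)|i - k|]^+ \log_2 P \rceil$ over $k$, and that (30) adds the distance penalty $(\gamma - 1)|i - j|$ inside the positive part; the function names and example parameters are hypothetical.

```python
import math

def b_conv_exp(P, gamma, K, i):
    """Conventional allocation (29): bits for the estimate of h_i, identical at every TX."""
    log_p = math.log2(P)
    return sum(math.ceil(max(1 + (gamma - 1) * abs(i - k), 0.0) * log_p)
               for k in range(1, K + 1))

def b_dist_exp(P, gamma, K, i, j):
    """Distance-based allocation (30): exponent further reduced by (gamma - 1)|i - j|."""
    log_p = math.log2(P)
    return sum(math.ceil(max(1 + (gamma - 1) * (abs(i - k) + abs(i - j)), 0.0) * log_p)
               for k in range(1, K + 1))

P, gamma, K = 2.0 ** 20, 0.5, 7
conv = b_conv_exp(P, gamma, K, i=4)                            # 40
dist = [b_dist_exp(P, gamma, K, i=4, j=j) for j in (4, 5, 6)]  # [40, 10, 0]
```

In this example the TX at distance zero receives the same number of bits as in the conventional allocation, while a TX two positions away receives none.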



B. Properties of the Distance-Based CSIT allocation 

The distance-based CSIT allocation achieves the maximal DoF but a critical question is to 
determine whether the properties of cooperation at the local scale of the distance-based CSIT 
allocation are a consequence of the interference being limited to the direct neighbors in the 
Wyner model, or are more fundamental properties which hold in more general scenarios. 

Corollary 3. The size of the distance-based CSIT allocation at any TX remains bounded as K
increases:

$$ \forall j \in \{1, \dots, K\}, \quad s(e_j^T B^{dist}) = O(1), \ \text{as } K \text{ grows}. \tag{31} $$

Proof: The proof follows easily using the same approach as in the Wyner case. ■

Furthermore, the sharing of the users' data symbols can also be restricted to a finite
neighborhood without reduction of the DoF.

Corollary 4. In the distance-based CSIT allocation, it is sufficient for achieving the maximal
DoF that $s_i \in \mathcal{K}_j$ if

$$ 1 + (\gamma - 1)|i - j| > 0. \tag{32} $$

In the exponentially decaying model, it follows that for $\gamma < 1$,

$$ \forall j, \quad |\mathcal{K}_j| = O(1), \ \text{as } K \text{ grows}. \tag{33} $$

Proof: The proof follows easily using the same approach as in the Wyner case. ■

Hence, the appealing properties relative to both the size of the CSIT allocation and the
data symbols sharing are not restricted to the Wyner model but extend to the more general
exponentially decaying settings. The geometry of the network in this model remains simple with
a pathloss assumed to be regular across the TXs, and the TXs being located in a 1D network.

These assumptions are used in order to obtain analytical results, but the approach
which consists in restricting the cooperation to a local scale by exploiting the pathloss attenuation
is much more general and is believed to have a strong potential for improving the performance
in realistic networks. The possible extensions are further discussed in Section VII.



VI. Simulations 

The simulation results for the Wyner setting and the exponentially decaying settings are very
similar, so only the simulation results for the more general exponentially decaying model
are shown. In a first step, we verify by simulations that the maximal DoF per user is achieved
by the distance-based CSIT allocation. At the same time, we compare the distance-based CSIT
allocation to the CSI dissemination commonly used.

Hence, the average rate per user achieved as a function of the SNR for different CSIT allocation
policies is shown in Fig. 2. Specifically, the distance-based CSIT allocation in (30) is compared
to two alternative CSIT allocations having the same total size. In the first one, the CSI bits
are allocated uniformly to the TXs while conventional (non-overlapping) clustering of size 3
is used for the second. We consider an exponentially decaying channel as given in (27) with
K = 15 TXs and $\gamma = 0.6$. We use Monte-Carlo averaging over 1000 channel realizations.

With these parameters, the size of the distance-based CSIT allocation is only 10%
of the size of the conventional CSIT allocation. It can be seen that the distance-based CSI
dissemination achieves the maximal slope while the clustering solution has a smaller slope,
yet larger than the uniform CSIT allocation. The significant negative rate offset comes from
considering only the interference asymptotically in the SNR. This offset can easily be reduced
by taking other parameters into account, such as the number of significant interferers, and by
optimizing the CSIT allocation at finite SNR to ensure that the received interference is
sufficiently smaller than the noise.

Fig. 2. Average rate per user as a function of the SNR P for K = 15 and $\gamma = 0.6$. (Horizontal axis: SNR [dB].)

In Fig. 3, we compare the size of the distance-based CSIT allocation [Cf. (30)] with the size of
the conventional one [Cf. (29)]. As expected, the size of the conventional CSIT allocation scales
linearly with K while the size of the distance-based CSIT allocation saturates as K increases.

Fig. 3. Size of the CSIT allocations in terms of the number of users for $\gamma = 0.75$.
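The scaling behavior can be reproduced qualitatively without any channel simulation, by summing the pre-log factors of the allocations at one TX. The sketch below assumes that the size at TX $j$ is the sum over all RXs $i$ and channel entries $k$ of the bracketed terms of (29) and (30), which is what the ceilings converge to once divided by $\log_2(P)$; the function name `size_prelog` is hypothetical and the output is not the data of Fig. 3.

```python
def size_prelog(K, gamma, j, distance_based):
    """Per-TX size: sum over RXs i and channel entries k of the pre-log factors."""
    total = 0.0
    for i in range(1, K + 1):
        extra = abs(i - j) if distance_based else 0
        for k in range(1, K + 1):
            total += max(1 + (gamma - 1) * (abs(i - k) + extra), 0.0)
    return total

gamma = 0.75
conv = {K: size_prelog(K, gamma, K // 2, distance_based=False) for K in (10, 20, 40)}
dist = {K: size_prelog(K, gamma, K // 2, distance_based=True) for K in (10, 20, 40)}
# dist == {10: 11.0, 20: 11.0, 40: 11.0}, while conv keeps growing with K
```

For a TX in the middle of the network the distance-based value stays at 11 for every K in this sketch, whereas the conventional value grows roughly linearly with K.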



VII. Discussion of the Results And Extension to 2-D Networks 

In this work, we have studied the allocation of the CSI in a distributed CSI setting where
K TXs jointly transmit to K RXs with each TX having its own channel estimate. Considering
two pathloss geometries of 1D networks, the so-called Wyner model and the exponentially
decaying channels, we have derived a CSIT allocation achieving the maximal generalized DoF
with only a fraction of the CSIT allocation conventionally used. Interestingly, the number of
CSI bits provided to each TX does not scale with the size of the network. Furthermore, the
sharing of the users' data symbols is also restricted to a finite number of TXs, meaning that
our approach appears as a more efficient alternative to clustering where the level of cooperation
decreases smoothly with the distance between the TXs and the RXs instead of stopping at hard
cluster boundaries.

The models studied in this work have two main limitations. First, only exponentially
decaying channels have been studied and second, the TXs are assumed to be placed on a
one-dimensional space.

The assumption of exponentially decaying channels is in fact solely a consequence of the
DoF analysis. Indeed, the theoretical foundation for our approach comes from the fact that
exponentially decaying channels are closed under inversion such that the channel inverse (the
precoder) is also exponentially decaying. This also holds for polynomially decaying channels,
which model more accurately the free-space pathloss, and for sub-exponential channels [37]-[40].
However, when considering DoF, sub-exponential attenuation becomes negligible and the DoF
analysis is not adapted. In practical scenarios with finite SNR, the extension to other pathloss
models is thus expected to hold.

Extending the analytical results to planar networks is more challenging. Yet, the lessons learned 
are expected to remain valid and to lead to either heuristic CSIT allocations or algorithms to 
allocate the CSI. Indeed, our approach is based on the attenuation of the signal strength with 
the distance and it can for example be adapted by considering the distance between the TX/RX 
pairs (considering that the TX is close to its paired RX) in the CSIT allocation formula. Hence, 
adapting the CSIT allocation with the distance appears to have a strong potential to lessen the 
cost of TX cooperation in practical wireless networks. 

Appendix I 
Some Results on the Quantization 

Since each TX has a different channel estimate, the quantization over the Grassmannian space
which is commonly used in the MIMO BC [26] (i.e., based on choosing the unit-norm vector $w$ to
quantize $h_i$ if it maximizes $|h_i^H w|$) is not practical. Indeed, the objective is invariant by
multiplication with any unit-norm complex number, which leads to inconsistencies between the
channel estimates at the different TXs, thereby making the feedback obtained essentially useless
for joint precoding.

Consequently, we prefer instead the quantization scheme (Cf. [27] for more explanations)

$$ h_i^{(j)} = \arg\min_{w \in \mathcal{W}_i^{(j)}} \| h_i - w \|^2. \tag{34} $$

We will assume that an optimal codebook is used such that we can use the following classical
result from Rate-Distortion theory.

Theorem 13.3.3, [41]. Let $X_i \sim \mathcal{N}(0, \sigma_i^2)$, $i = 1, \dots, m$, be independent, and let the distortion measure be
$d(x^m, \hat{x}^m) = \sum_{i=1}^{m} (x_i - \hat{x}_i)^2$. Then the rate distortion function is given by

$$ R(D) = \sum_{i=1}^{m} \frac{1}{2} \log \left( \frac{\sigma_i^2}{D_i} \right), \tag{35} $$

where

$$ D_i = \begin{cases} \lambda & \text{if } \lambda < \sigma_i^2 \\ \sigma_i^2 & \text{if } \lambda \geq \sigma_i^2, \end{cases} \tag{36} $$

where $\lambda$ is chosen so that $\sum_{i=1}^{m} D_i = D$.

We assume in this work that an optimal Gaussian quantization is used such that the estimation
error can be obtained using (36). Note that only the scaling in the SNR P is relevant such that
using instead suboptimal scalar quantization schemes (e.g., scalar uniform quantization) would
not change the results.
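The rate-distortion result quoted above is the classical reverse water-filling solution for independent real Gaussian components, and it can be evaluated numerically with a simple bisection on the water level. The sketch below is an illustration of that textbook result, not of the quantizer used in the simulations; the function name and example variances are assumptions.

```python
import math

def reverse_water_filling(variances, D, iters=200):
    """Return (rate in bits, per-component distortions) for a total distortion D."""
    if not 0 < D <= sum(variances):
        raise ValueError("D must lie in (0, sum of variances]")
    lo, hi = 0.0, max(variances)
    for _ in range(iters):                      # bisection on the water level
        lam = (lo + hi) / 2
        if sum(min(lam, v) for v in variances) < D:
            lo = lam
        else:
            hi = lam
    dist = [min(hi, v) for v in variances]      # D_i = min(lambda, sigma_i^2)
    rate = sum(0.5 * math.log2(v / d) for v, d in zip(variances, dist))
    return rate, dist

rate, dist = reverse_water_filling([1.0, 0.25, 0.01], D=0.12)
# The weakest component is not described at all (D_3 = sigma_3^2 = 0.01),
# and the two strong components share the water level 0.055.
```

Components whose variance lies below the water level receive no bits, which mirrors the way weak channel coefficients can be dropped from the feedback.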



Appendix II 
Proof of Proposition [T] 

Proof: We start by defining the rate difference Ar ^ between the rate of user i based on 
perfect CSI and the rate achieved with limited feedback. As in ||7|, p6| , we can then write 

PI^H 12 

PCSI|2\1 t:. i„„ ( 1 , U-i 



AR,i=EH,w[log2(l+P|^rii: 



-E 



logs 1 + 



<Eh 



w 



, , l+P|^f<CSI|2 
log? Tr-^ — 



+E 



(37) 
(38) 
(39) 
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where ([39]) is obtained from the fact that Ui and u^'^^^ are isotropically distributed in the channel 
nuUspace. We further obtain 



(40) 
(41) 
(42) 



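We can then easily upper-bound (42) to write

$$\Delta_{R,i} \leq \mathrm{E}_{\mathbf{H},\mathcal{W}}\left[\log_2\left(1 + P\|\mathbf{h}_i\|^2 \|\mathbf{U} - \mathbf{U}^{\mathrm{PCSI}}\|_F^2\right)\right] \tag{43}$$

$$\leq \mathrm{E}_{\mathbf{H},\mathcal{W}}\left[\log_2\left(1 + P\|\mathbf{U} - \mathbf{U}^{\mathrm{PCSI}}\|_F^2\right)\right] + \mathrm{E}_{\mathbf{H},\mathcal{W}}\left[\log_2\left(1 + \|\mathbf{h}_i\|^2\right)\right] \tag{44}$$

$$\leq \log_2\left(1 + P\,\mathrm{E}_{\mathbf{H},\mathcal{W}}\left[\|\mathbf{U} - \mathbf{U}^{\mathrm{PCSI}}\|_F^2\right]\right) + \mathrm{E}_{\mathbf{H},\mathcal{W}}\left[\log_2\left(1 + \|\mathbf{h}_i\|^2\right)\right]. \tag{45}$$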



The maximal DoF is achieved if the rate difference $\Delta_{R,i}$ remains bounded as the SNR increases. From (45), this can be seen to be verified if the condition of the proposition holds, which concludes the proof. ■

Appendix III 
Proof of Proposition 2

Proof: We consider without loss of generality the precoding at TX $j$ and we define the channel estimation error at TX $j$ as $\mathbf{N}^{(j)} = \mathbf{H}^{(j)} - \mathbf{H}$. We further assume that all the channel elements are known with the same accuracy. Hence, we write $\mathbf{N}^{(j)} = \sigma^{(j)} \tilde{\mathbf{N}}^{(j)}$ where $\tilde{\mathbf{N}}^{(j)}$ has its elements i.i.d. $\mathcal{N}_{\mathbb{C}}(0, 1)$. We start by recalling the well-known resolvent equality.

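Proposition 3 (Resolvent equality). Let $\mathbf{A} \in \mathbb{C}^{n \times n}$ and $\mathbf{B} \in \mathbb{C}^{n \times n}$ be two invertible matrices. It then holds that

$$\mathbf{A}^{-1} - \mathbf{B}^{-1} = \mathbf{B}^{-1}(\mathbf{B} - \mathbf{A})\mathbf{A}^{-1}. \tag{46}$$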
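Using the resolvent equality, we can then write

$$(\mathbf{H}^{(j)})^{-1} - \mathbf{H}^{-1} = -\mathbf{H}^{-1}\mathbf{N}^{(j)}\mathbf{H}^{-1} + (\mathbf{H}^{(j)})^{-1}\mathbf{N}^{(j)}\mathbf{H}^{-1}\mathbf{N}^{(j)}\mathbf{H}^{-1}. \tag{47}$$

We can then use the triangle inequality to obtain the upper bound

$$\|(\mathbf{H}^{(j)})^{-1} - \mathbf{H}^{-1}\|_F \leq \sigma^{(j)} \|\mathbf{H}^{-1}\|_F^2 \|\tilde{\mathbf{N}}^{(j)}\|_F + (\sigma^{(j)})^2 \|(\mathbf{H}^{(j)})^{-1}\|_F \|\mathbf{H}^{-1}\|_F^2 \|\tilde{\mathbf{N}}^{(j)}\|_F^2. \tag{48}$$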

Since the channel is well conditioned (thanks to the interference attenuation), its eigenvalues are bounded away from zero and the expectations of the norms in (48) remain bounded. Hence, it is sufficient to have $(\sigma^{(j)})^2 = O(1/P)$ to fulfill the condition given in Proposition 1. Considering the quantization scheme described in Subsection II-B and Appendix I, it follows that using $\log_2(P\sigma_{ik}^2)$ bits to quantize the channel element $H_{ik}$ (where $\mathrm{E}[|H_{ik}|^2] = \sigma_{ik}^2$) leads to an estimation error with a variance equal to $1/P$. Summing over all the channel elements gives the CSIT allocation given in the proposition. ■
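The expansion in (47) and the resulting $1/P$ scaling of the squared error on the channel inverse can be checked numerically. The Python sketch below is an illustration under assumptions that are not in the original: the channel is a hypothetical 4×4 matrix with unit direct links and weak cross links, the normalized error matrix is drawn once and rescaled, and all names are introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4  # hypothetical number of TX/RX pairs


def complex_gaussian(shape):
    """Entries i.i.d. circularly symmetric complex Gaussian with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


# Assumed well-conditioned channel: unit direct links plus weak cross links.
H = np.eye(K) + 0.1 * complex_gaussian((K, K))
H_inv = np.linalg.inv(H)
N_tilde = complex_gaussian((K, K))  # normalized estimation error


def inverse_gap(P):
    """Return (H_est^{-1} - H^{-1}, N, H_est^{-1}) for an error variance 1/P."""
    N = N_tilde / np.sqrt(P)
    H_est_inv = np.linalg.inv(H + N)
    return H_est_inv - H_inv, N, H_est_inv


gap, N, H_est_inv = inverse_gap(1e4)
# Right-hand side of (47): first-order term plus second-order remainder.
rhs = -H_inv @ N @ H_inv + H_est_inv @ N @ H_inv @ N @ H_inv
identity_holds = np.allclose(gap, rhs, rtol=1e-9, atol=1e-12)

# With sigma^2 = 1/P, the product P * ||gap||_F^2 should level off as P grows.
scaled = [P * np.linalg.norm(inverse_gap(P)[0], "fro") ** 2 for P in (1e6, 1e8, 1e10)]
print(identity_holds, scaled)
```

The nearly constant values of `scaled` reflect that the first-order term of (47) dominates at high SNR, so that the squared error on the inverse decays as $1/P$ when $(\sigma^{(j)})^2 = 1/P$.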

Appendix IV
Proof of Theorem 1

In the preliminary results presented in [1], the proof was based on the fact that a closed-form formula exists for the inverse of a tridiagonal matrix (e.g., in p6|). This formula was then used to study the impact of imperfect CSI on the ZF precoder. We use here another approach, where we write the inverse of the channel matrix as an infinite summation. This method has the advantage of being both simpler and more general.

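Proof: Let us focus without loss of generality on the CSIT allocation at TX $j$. It is shown in Proposition 1 that a sufficient criterion for achieving the maximal DoF is to have every element of the $j$-th row of $(\mathbf{H}^{(j)})^{-1} - \mathbf{H}^{-1}$ go to zero as $O(1/P)$.

We start by defining the matrix $\mathbf{D} = \mathrm{diag}(\mathbf{H})$. We then define $\mathbf{X} = \mathbf{D} - \mathbf{H} = O(\mu) \ll 1$, and $\tilde{\mathbf{X}} = \mu^{-1}\mathbf{X}$. Consequently, the matrix $\mathbf{H}$ is known to have an inverse which can be written as

$$\forall i, j, \quad \mathbf{e}_j^{\mathrm{T}} \mathbf{H}^{-1} \mathbf{e}_i = \mathbf{e}_j^{\mathrm{T}} \sum_{k=0}^{\infty} \left(\mathbf{D}^{-1}(\mathbf{D} - \mathbf{H})\right)^k \mathbf{D}^{-1} \mathbf{e}_i = \sum_{k=0}^{\infty} \mu^k \mathbf{e}_j^{\mathrm{T}} (\mathbf{D}^{-1}\tilde{\mathbf{X}})^k \mathbf{D}^{-1} \mathbf{e}_i. \tag{49}$$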
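The series in (49) can be checked numerically on a small example. The sketch below is an illustration only: the tridiagonal test matrix, its size, the value of `mu`, and the helper name `neumann_inverse` are assumptions introduced here rather than quantities taken from the derivation.

```python
import numpy as np


def neumann_inverse(H, num_terms):
    """Truncated series sum_{k < num_terms} (D^{-1} X)^k D^{-1},
    with D = diag(H) and X = D - H."""
    D_inv = np.diag(1.0 / np.diag(H))
    X = np.diag(np.diag(H)) - H
    M = D_inv @ X
    term = D_inv.copy()            # k = 0 term
    total = np.zeros_like(D_inv)
    for _ in range(num_terms):
        total = total + term
        term = M @ term            # next term: (D^{-1} X)^(k+1) D^{-1}
    return total


K, mu = 6, 0.1                     # hypothetical size and interference level
d = np.linspace(1.0, 1.5, K)       # assumed direct-link gains
H = np.diag(d) + mu * (np.eye(K, k=1) + np.eye(K, k=-1))

exact = np.linalg.inv(H)
errors = [np.max(np.abs(neumann_inverse(H, n) - exact)) for n in (1, 2, 4, 8, 16)]
print(errors)
```

The truncation error shrinks geometrically with the number of terms, which is what allows each term of the series to be associated with a given power of $\mu$.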



The channel inverse is written in (49) as a polynomial in $\mu$ and we will now identify the coefficients of this polynomial. Specifically, for each channel element, we will find the smallest order in $\mu$ at which it arises. This will then directly determine the accuracy with which this coefficient should be known. Hence, the proof is divided in two parts: the first part consists in finding this smallest order in $\mu$ for every coefficient, and the second part consists simply in using the results on the quantization in Appendix I to obtain the sufficient CSIT allocation.

Note that the condition in Proposition 1 is in fact on the unitary beamformers $\mathbf{U}^{(j)}$. Yet, it can be easily seen that fulfilling this condition for the channel inverse ensures that it is also fulfilled by the unitary beamformers.

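1) We denote by $C^{k}_{ji}$ the coefficient corresponding to the $k$th element in the summation of (49), which also corresponds to the sum of the terms in $\mu^k$. We then have

$$C^{0}_{ji} = \frac{1}{d_j}\,\delta_j(i) \tag{50}$$

and

$$C^{1}_{ji} = \frac{X_{ji}}{d_j d_i}. \tag{51}$$

The coefficient $d_j$ appears as a zero-th order term for $i = j$ while $d_{j-1}$, $d_{j+1}$, $b_j$ and $a_j$ appear linear in $\mu$ for $i = j + 1$ and/or $i = j - 1$. Going further,

$$C^{2}_{ji} = \sum_{k=1}^{K} \frac{X_{jk} X_{ki}}{d_j d_k d_i}. \tag{52}$$

Hence, $d_{j \pm 2}$, $b_{j \pm 1}$ and $a_{j \pm 1}$ appear with a coefficient $\mu^2$. We will now show that the approach used for the coefficients $C^{0}_{ji}$, $C^{1}_{ji}$ and $C^{2}_{ji}$ can be generalized to any coefficient $C^{n}_{ji}$. From (49), we can write

$$C^{n}_{ji} = \sum_{k_1=1}^{K} \cdots \sum_{k_{n-1}=1}^{K} \frac{X_{j,k_1} X_{k_1,k_2} \cdots X_{k_{n-1},i}}{d_j d_{k_1} \cdots d_{k_{n-1}} d_i} \tag{53}$$

$$= (-1)^n \sum_{k_1=1}^{K} \cdots \sum_{k_{n-1}=1}^{K} \frac{\left(H_{j,j+1}\delta_{j+1}(k_1) + H_{j,j-1}\delta_{j-1}(k_1)\right)\left(H_{k_1,k_1+1}\delta_{k_1+1}(k_2) + H_{k_1,k_1-1}\delta_{k_1-1}(k_2)\right) \cdots \left(H_{k_{n-1},k_{n-1}+1}\delta_{k_{n-1}+1}(i) + H_{k_{n-1},k_{n-1}-1}\delta_{k_{n-1}-1}(i)\right)}{d_j d_{k_1} \cdots d_{k_{n-1}} d_i}. \tag{54}$$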

It can be seen in (54) that $C^{n}_{ji}$ contains all the elements such that the $k_m$ verify that $\forall m = 0, \ldots, n-1$, $k_{m+1} = k_m + 1$ or $k_{m+1} = k_m - 1$, where $k_0 = j$ and $k_n = i$. In words, $C^{n}_{ji}$ contains all the chains going from TX $i$ to RX $j$, without forgetting the multiplication by the direct channel of TX $i$. This property is illustrated in Fig. 4.

Fig. 4. The coefficient $H_{k\ell}$ with $k \neq \ell$ appears in $C^{n}_{ji}$ (with the coefficient $\mu^n$) if there exists a path from TX $i$ to RX $j$ going through this channel element. The accuracy relative to the direct channels can be obtained by considering "paths" starting from the RXs.

Going through all the $C^{n}_{ji}$, the smallest exponent in $\mu$ with which the channel coefficient from TX $k$ to RX $i$ appears is obtained by finding the "most direct" path from TX $i$ to RX $j$ going through this coefficient. It can be easily seen by considering this condition for every $i$ that the smallest exponent in $\mu$ with which the coefficients $a_i$ and $b_i$ arise is $|i - j| + 1$ while $d_i$ is multiplied with $\mu$ at the exponent $|i - j|$.
2) Using Appendix I, we obtain that it is sufficient to use $\log_2(\mu^{2n} P)$ bits to quantize $x$ with an error scaling in $O(1/P)$ if $x$ appears weighted with $\mu^{n}$. Since $\mu^2$ is itself a power of $P$, the CSIT allocation given in (22) allows each TX to compute its ZF precoding coefficients up to an error scaling in $O(1/P)$. From Proposition 1, the maximal DoF is then achieved.
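The exponents identified in the first part of the proof can be observed numerically. The following Python sketch is illustrative only: it assumes a hypothetical 5×5 tridiagonal (Wyner-type) channel with randomly drawn gains, and estimates from two values of $\mu$ the power of $\mu$ with which each entry of $\mathbf{H}^{-1}$ scales. All names and numerical values are assumptions made for this example.

```python
import numpy as np


def wyner_channel(mu, direct, upper, lower):
    """Tridiagonal channel: direct gains on the diagonal, cross gains scaled by mu."""
    return np.diag(direct) + mu * (np.diag(upper, 1) + np.diag(lower, -1))


K = 5                                   # hypothetical network size
rng = np.random.default_rng(1)
direct = rng.uniform(1.0, 2.0, K)       # assumed direct-link gains
upper = rng.uniform(0.5, 1.5, K - 1)    # assumed cross gains (super-diagonal)
lower = rng.uniform(0.5, 1.5, K - 1)    # assumed cross gains (sub-diagonal)

mu_a, mu_b = 0.02, 0.01
inv_a = np.linalg.inv(wyner_channel(mu_a, direct, upper, lower))
inv_b = np.linalg.inv(wyner_channel(mu_b, direct, upper, lower))

# If an entry behaves as c * mu^p, then p = log(ratio of entries) / log(mu_a / mu_b).
exponents = np.log(np.abs(inv_a) / np.abs(inv_b)) / np.log(mu_a / mu_b)
estimated = np.rint(exponents).astype(int)
expected = np.abs(np.subtract.outer(np.arange(K), np.arange(K)))
print(estimated)
```

In this example the entry $(j, i)$ of the inverse scales as $\mu^{|i - j|}$, which is consistent with $d_i$ appearing in the $j$-th row with the exponent $|i - j|$.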
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Appendix V 
Proof of Theorem 2



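Proof: This proof relies on the same idea as for the Wyner case in Appendix IV, so that we will go only through the main steps. First, the channel inverse is written using the von Neumann series. Using this expression, it will be shown that it is possible to find the smallest exponent in $\mu$ with which any channel coefficient appears in the expression of the inverse. We keep the same definitions for $\mathbf{D} = \mathrm{diag}(\mathbf{H})$ and $\mathbf{X} = \mathbf{D} - \mathbf{H}$ as in the Wyner case. We also denote by $C^{k}_{ji}$, $k = 0, \ldots, \infty$, the elements in the von Neumann series. It then holds

$$C^{0}_{ji} = \frac{1}{d_j}\,\delta_j(i) \tag{55}$$

and

$$C^{1}_{ji} = \frac{X_{ji}}{d_j d_i}, \tag{56}$$

which is proportional to $\mu^{|i-j|} G_{ji}$ if $i \neq j$ and equal to $0$ otherwise. Going directly to an arbitrary index $n$ in the von Neumann summation, we write

$$C^{n}_{ji} = \sum_{k_1=1}^{K} \cdots \sum_{k_{n-1}=1}^{K} \frac{X_{j,k_1} X_{k_1,k_2} \cdots X_{k_{n-1},i}}{d_j d_{k_1} \cdots d_{k_{n-1}} d_i} = \sum_{\substack{k_1=1 \\ k_1 \neq j}}^{K} \sum_{\substack{k_2=1 \\ k_2 \neq k_1}}^{K} \cdots \sum_{\substack{k_{n-1}=1 \\ k_{n-1} \neq i}}^{K} \frac{X_{j,k_1} X_{k_1,k_2} \cdots X_{k_{n-1},i}}{d_j d_{k_1} \cdots d_{k_{n-1}} d_i}. \tag{57}$$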

In contrast to the Wyner case, $C^{n}_{ji}$ contains terms with different exponents in $\mu$. Yet, in a similar way as in the Wyner case, they correspond to "chains" going from $i$ to $j$ where the number of "steps" corresponds to the exponent in $\mu$. Hence, considering all the streams $i$, we obtain that the smallest exponent of $\mu$ with which the channel coefficient from TX $k$ to RX $i$ appears in the $j$th row of the channel inverse is $|i - j| + |i - k|$. The proof then concludes as for the Wyner case. ■
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